Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the Iceberg integration to better recover from auth/metadata issues by refreshing the REST catalog client on auth-related failures, adding retry behavior around table operations, and upgrading to apache/iceberg-go v0.5.0.
Changes:
- Upgrade
github.com/apache/iceberg-gotov0.5.0(and related dependency bumps). - Add catalog refresh + retry-on-auth-error behavior in
catalogx.Clientmethods, with new unit tests. - Add writer/committer reload paths and retry behavior for write/commit operations.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/impl/iceberg/router.go | Adjusts write retry behavior and wires a table reload callback into the committer. |
| internal/impl/iceberg/committer.go | Adds table reload on commit retry/failure so future commits use fresh metadata/auth. |
| internal/impl/iceberg/catalogx/catalog.go | Implements auth-error-triggered catalog refresh and a refresh coordinator. |
| internal/impl/iceberg/catalogx/catalog_test.go | Adds tests validating retry-on-auth behavior and concurrent refresh behavior. |
| go.mod | Bumps iceberg-go and several dependencies (including x/sync for semaphore). |
| go.sum | Updates dependency checksums for bumped modules. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Commits Review
|
This can happen if auth expires, etc, and forcing a refresh and retry is a good idea (the snowflake connector does this too).
|
Commits Review
|
| url string | ||
| opts []rest.Option | ||
| namespace []string | ||
| mu *semaphore.Weighted |
There was a problem hiding this comment.
Could this be renamed to avoid confusion, also note why we have a Semaphore would be helpful.
There was a problem hiding this comment.
Pulled it out into a syncx package and added comments to explain the rationale. PTAL
| if isAuthErr(err) { | ||
| if refreshErr := c.refreshCatalog(ctx); refreshErr != nil { | ||
| return nil, fmt.Errorf("refreshing catalog during appending data files %w: %v", err, refreshErr) | ||
| } | ||
| } |
There was a problem hiding this comment.
To reduce code duplication we could have refreshCatalogOnAuthErr() that would handle this uniformly.
There was a problem hiding this comment.
Done - it's not always that simple because we retry somethings
|
|
||
| func (c *Client) refreshCatalog(ctx context.Context) error { | ||
| if !c.mu.TryAcquire(1) { | ||
| // In this case someone else is trying to refresh the catalog, |
There was a problem hiding this comment.
I'm trying to understand what is the benefit compared to RW mutex?
There was a problem hiding this comment.
Because the context can be cancelled and that is respected. When doing IO and the lock can be held for a while it's still nice to support context cancellation
It uses mainline of iceberg-go: apache/iceberg-go@6842055 With the following PRs applied on top: - apache/iceberg-go#795 - apache/iceberg-go#803
There are a few issues where iceberg go does not refresh authentication, so we do that when we interact with the library by agressively retrying.
Upstream fixes: